When you open RStudio, you will see 3 different windows along with a number of tabs:
This is the R console, where you key in commands to be run in an interactive fashion. Type in your command and hit the Enter key. Once you hit the Enter key, R executes your command and prints the result, if any.
The Environment tab lists the objects currently in your workspace; typing ls() in the console gives the same list.
The Help tab shows documentation: help for functionName appears here when you type ?functionName in the console.
The script window is not open yet, but it will become useful later when we are working with scripts. To open it, click the icon in the top-left corner of the window, and click “R Script”. A new window pane that looks like a text editor opens up.
We’ll explore scripts later in the course, but for now, this is a useful place for us to type out long commands (especially those which span multiple lines). To execute code from this window, highlight the code and click the button at the top of the window (or press Cmd-Enter on a Mac, Ctrl-Enter on Windows).
You can use R as a high-powered calculator. For example,
1 + 2
## [1] 3
456 * 7
## [1] 3192
5 / 2
## [1] 2.5
Notice that the command 5/2 gave the result 2.5, while several other programming languages would typically give 2 as a result, because they perform integer division when both operands are integers.
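If the integer quotient is what you actually want, R has separate operators for it. The following is a small sketch using the base R operators %/% (integer quotient) and %% (remainder); the variable names are made up for illustration.

```r
# Ordinary division returns the exact quotient
exact <- 5 / 2         # 2.5

# %/% gives the integer quotient, %% gives the remainder
quotient <- 5 %/% 2    # 2
remainder <- 5 %% 2    # 1

# The two pieces recombine to the original number
quotient * 2 + remainder == 5   # TRUE
```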
There are several math functions which come with R. For example, to evaluate \(\log(e^{25} - 2^{\sin(\pi)})\), we would type
log(exp(25) - 2^(sin(pi)))
## [1] 25
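A few more of the built-in math functions, as a quick sketch. All of these are part of base R; the particular numbers are arbitrary choices for illustration.

```r
sqrt(16)              # square root: 4
abs(-3)               # absolute value: 3
round(3.14159, 2)     # round to 2 decimal places: 3.14
log(100, base = 10)   # logarithm to a chosen base: 2
log10(100)            # shortcut for base 10: 2
```

Note that log() on its own is the natural logarithm, which is why log(exp(25)) comes out as 25.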
Apart from numbers, R supports a number of different “types” of variables. The most commonly used ones are numeric variables, character variables (i.e. strings), factor variables, and boolean (or logical) variables. (We’ll talk about factors in Session 2.)
We can check the type of a variable by using the typeof function:
typeof("1")
## [1] "character"
typeof(TRUE)
## [1] "logical"
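Numeric values come in more than one flavour, which typeof makes visible. The lines below are a brief sketch using only base R; the L suffix is R's way of writing an integer literal.

```r
typeof(1)       # "double": numbers are stored as doubles by default
typeof(1L)      # "integer": the L suffix asks for an integer
typeof("a")     # "character"

# Doubles and integers both count as numeric
is.numeric(1)   # TRUE
is.numeric(1L)  # TRUE
```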
We can change the type of a variable to type x using the function as.x. This process is called “coercion”. For example, the following code changes the number 6507232300 to the string "6507232300":
as.character(6507232300)
## [1] "6507232300"
typeof(6507232300)
## [1] "double"
typeof(as.character(6507232300))
## [1] "character"
We can also change variables to numbers or boolean variables.
as.numeric("123")
## [1] 123
as.logical(123)
## [1] TRUE
as.logical(0)
## [1] FALSE
Sometimes type conversion might not work:
as.numeric("def")
## Warning: NAs introduced by coercion
## [1] NA
Sometimes type conversion does not work as you might expect. Always check that the result is what you want!
as.logical("123")
## [1] NA
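One way to act on this advice is to test the result for NA before using it. The sketch below uses the base R functions is.na() and suppressWarnings(); the variable names input and result are made up for illustration.

```r
input <- "def"

# suppressWarnings() hides the coercion warning; we check for NA ourselves
result <- suppressWarnings(as.numeric(input))
is.na(result)        # TRUE: "def" could not be read as a number

# Strings convert to logical only if they spell out a truth value
as.logical("TRUE")   # TRUE
as.logical("yes")    # NA
```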
Often, we want to store the result of a computation so that we can use it later. R allows us to do this by variable assignment. Variable names must start with a letter (or with a period that is not followed by a digit) and can only contain letters, numbers, underscores (_) and periods (.).
The following code assigns the value 2 to the variable x:
x <- 2
Do not use the = sign to assign values to variables! Although it works in R, it can cause a lot of confusion, because = is also used to pass named arguments to functions.
Notice that no output was printed. This is because the act of variable assignment doesn’t produce any output. If we want to see what x contains, we can simply key its name into the console:
x
## [1] 2
For more complex objects that we will encounter soon, we can use the str function to get information on the internal structure of the object:
str(x)
## num 2
We can use x in computations:
x^2 + 3*x
## [1] 10
We can also reassign x to a different value:
x <- x^2
x
## [1] 4
What are the values of x and y after I execute the following code?
y <- x
x <- x^2
Let’s add a third variable:
z <- 3
Note that we now have 3 entries in our Environment tab. To remove an object/variable, use the rm() function:
rm(x)
To remove more than one object, separate them by commas:
rm(y, z)
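To confirm that an object is really gone, we can ask R whether a name is currently defined. This is a minimal sketch using the base R function exists(); tmp_value is a throwaway name chosen for illustration so that it does not interfere with x, y and z.

```r
tmp_value <- 10
exists("tmp_value")   # TRUE: the object is defined

rm(tmp_value)
exists("tmp_value")   # FALSE: the object has been removed
```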
Let’s add the 3 variables back again:
x <- 1; y <- 2; z <- 3
To remove all objects at once, use the following code:
rm(list = ls())
For data analysis, we often have to work with multiple values at the same time. There are a number of different R objects which allow us to do this.
The vector is a 1-dimensional array whose entries all have the same type. For example, the following code produces a vector containing the numbers 1, 2 and 3:
vec <- c(1, 2, 3)
vec
## [1] 1 2 3
Just as we had the as.x functions to coerce variables to type x, R has is.x functions to check if a variable is of type x.
is.vector(vec)
## [1] TRUE
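The same pattern works for the basic types. A short sketch with a few of the base R is.x functions (the vector here is just the earlier example written out again):

```r
vec <- c(1, 2, 3)

is.numeric(vec)     # TRUE
is.character(vec)   # FALSE
is.logical(vec)     # FALSE

# as.x and is.x pair up naturally
is.character(as.character(vec))   # TRUE
```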
Typing out all the elements can be tedious. Sometimes there are shortcuts we can use. The following code assigns a vector of the numbers 1 to 100 to vec:
vec <- 1:100
vec
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
## [18] 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
## [35] 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
## [52] 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68
## [69] 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85
## [86] 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
What if I only want even numbers from 1 to 100 (inclusive)? We can manipulate vectors using arithmetic operations (just like numbers). Note that arithmetic operations happen element-wise.
even <- 1:50 * 2
even
## [1] 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34
## [18] 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68
## [35] 70 72 74 76 78 80 82 84 86 88 90 92 94 96 98 100
We can also get the odd numbers:
odd <- even - 1
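Another way to build the same vectors is the base R function seq(), which takes a start, an end and a step size. The sketch below uses the hypothetical names even_seq and odd_seq so that it does not overwrite even and odd.

```r
even_seq <- seq(2, 100, by = 2)   # 2, 4, ..., 100
odd_seq  <- seq(1, 99, by = 2)    # 1, 3, ..., 99

# Same values as the arithmetic approach
all(even_seq == 1:50 * 2)       # TRUE
all(odd_seq == 1:50 * 2 - 1)    # TRUE
```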
We can use the c() function to combine (“concatenate”) several small vectors into one large vector. How many elements does the vector z have?
z <- 1:5
z <- c(z, 3, z)
R allows us to access individual elements in a vector. Unlike many other programming languages, indexing begins at 1, not 0. For example, to return the first even number, I would use the following code:
even[1]
## [1] 2
We can get multiple elements of a vector as well. The following code extracts the 5th to 9th even numbers (inclusive), and assigns them to the variable y:
y <- even[5:9]
y
## [1] 10 12 14 16 18
This extracts just the 3rd and 5th even numbers:
even[c(3,5)]
## [1] 6 10
What if I want all even numbers except the first two? I can use negative indexing to achieve my goal:
even[-c(1,2)]
## [1] 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
## [18] 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70 72
## [35] 74 76 78 80 82 84 86 88 90 92 94 96 98 100
Use the length function to figure out how many elements there are in a vector. What happens if I try to extract an element at an index greater than the vector’s length?
length(odd)
## [1] 50
odd[51]
## [1] NA
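Reading past the end of a vector gives NA rather than an error, and assigning past the end quietly extends the vector. A small sketch of both behaviours, using a made-up vector v:

```r
v <- c(10, 20, 30)

v[5]           # NA: there is no 5th element, but this is not an error

v[5] <- 50     # assigning past the end extends the vector
v              # 10 20 30 NA 50 (the skipped 4th position is filled with NA)
length(v)      # 5
```

This is convenient, but it also means that an off-by-one index will not stop your code, so it is worth checking lengths when a result looks odd.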
One last note about vectors: the elements in a vector have to be of the same type. How do you think R gets the result for the code below?
c(1, 2, "a")
## [1] "1" "2" "a"
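When the types differ, R converts every element to the most flexible type present: logical gives way to integer, integer to double, and double to character. A brief sketch of that ordering, with arbitrary values chosen for illustration:

```r
typeof(c(TRUE, 2L))    # "integer"
typeof(c(2L, 3.5))     # "double"
typeof(c(3.5, "a"))    # "character"

c(TRUE, 2, "a")        # "TRUE" "2" "a"
```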
Matrices are just the 2-dimensional analogs of vectors while arrays are the \(n\)-dimensional analogs of vectors. We won’t be talking about them a whole lot in this class. As with vectors, elements of matrices and arrays have to be of the same type.
Use the matrix() command to change a vector into a matrix:
A <- matrix(LETTERS, nrow = 2)
A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] "A" "C" "E" "G" "I" "K" "M" "O" "Q" "S" "U" "W" "Y"
## [2,] "B" "D" "F" "H" "J" "L" "N" "P" "R" "T" "V" "X" "Z"
Notice that R takes the elements in the vector you give it and fills in the matrix column by column. If we want the elements to be filled in by row instead, we have to put in a byrow = TRUE argument:
B <- matrix(letters, nrow = 2, byrow = TRUE)
To get the dimensions of the matrix, we can use the dim, nrow and ncol functions.
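Applied to a matrix built the same way as A above, these functions look like this (a short sketch; A is re-created here so the block runs on its own):

```r
A <- matrix(LETTERS, nrow = 2)

dim(A)    # 2 13: number of rows first, then number of columns
nrow(A)   # 2
ncol(A)   # 13
```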
To access the element in the ith row and jth column of the matrix B, use the index B[i, j]:
B[1, 2] # for the element in the 1st row and 2nd column
## [1] "b"
What do you think A[2,] returns? How about A[,2]?
In all the data structures so far, the elements have to be of the same type. To have elements of different types in one data structure, we can use a list, which we create with list(). We can think of a list as a collection of key-value pairs. Keys should be strings.
person <- list(name = "John Doe", age = 26)
person
## $name
## [1] "John Doe"
##
## $age
## [1] 26
The str function can be used to inspect what is inside person:
str(person)
## List of 2
## $ name: chr "John Doe"
## $ age : num 26
To access the name element of person, we have 2 options:
person[["name"]]
## [1] "John Doe"
person$name
## [1] "John Doe"
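Single square brackets also work on lists, but they return a smaller list rather than the element itself. The sketch below re-creates person so that it runs on its own, and compares the two forms:

```r
person <- list(name = "John Doe", age = 26)

person[["name"]]           # "John Doe": the element itself
person["name"]             # a list of length 1 that contains the name element

typeof(person[["name"]])   # "character"
typeof(person["name"])     # "list"
```

In practice, [[ ]] or $ is usually what you want when you need the value, and [ ] when you want to keep a subset of the list as a list.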
The elements of a list can be anything, even another data structure! Let’s add the names of John’s children to the person object:
person$children <- c("Ross", "Robert")
str(person)
## List of 3
## $ name : chr "John Doe"
## $ age : num 26
## $ children: chr [1:2] "Ross" "Robert"
To see the keys associated with a list, use the names() function:
names(person)
## [1] "name" "age" "children"
This section is for documentation purposes: By displaying my session info, others who read this document will know what the system set-up was when I ran the commands above.
sessionInfo()
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] compiler_3.5.1 backports_1.1.2 magrittr_1.5 rprojroot_1.3-2
## [5] tools_3.5.1 htmltools_0.3.6 yaml_2.1.19 Rcpp_0.12.17
## [9] stringi_1.2.3 rmarkdown_1.10 knitr_1.20 stringr_1.3.1
## [13] digest_0.6.15 evaluate_0.10.1